Purpose


GeneLab is being developed by NASA to accelerate “open science” biomedical research in support of the
human exploration of space and the improvement of life on Earth. Phase I of the four-phase GeneLab Data Systems
(GLDS) project emphasized capabilities for submission, curation, search, and retrieval of genomics, transcriptomics,
and proteomics (“omics”) data from biomedical research of space environments. Development of the GLDS in
Phase II has focused on federated search for, and retrieval of, these kinds of data across other open-access
systems, so that users can conduct biological meta-investigations using data from a variety of sources. Such
meta-investigations are key to corroborating findings from many kinds of assays and translating them into systems
biology knowledge and, eventually, therapeutics.


System Design 


The GLDS currently serves over 100 omics investigations
to the biomedical community via open access. In order to
expand the scope of metadata record searches via the
[...] RAST). Each of these systems defines metadata for omics
data sets differently. One solution to bridge such
differences is to employ a common object model (COM) to 
which each system’s representation of metadata can be
mapped. Warehoused metadata records are then
transformed during extract, transform, load (ETL) processing to this single, common representation.
Queries generated via the GLDS are then executed against 
the warehouse, and matching records are shown in the 
COM representation (Fig. 1). While this approach is 
relatively straightforward to implement, the volume of the 
[...] with latency and currency of records. Furthermore, the lack
of a coordinated, universal registry of these kinds of data creates the issue of data duplication in federated search
systems.
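
The mapping idea described above can be illustrated with a minimal sketch. It is not the GLDS implementation: the `CommonRecord` fields, the two source systems "A" and "B", their raw key names, and the duplicate-detection key are all hypothetical and chosen only for illustration. The sketch shows per-source mappers that transform raw metadata into one common representation, a query that runs against that representation regardless of record origin, and a simple heuristic for collapsing records that appear in more than one source.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CommonRecord:
    """Hypothetical common object model (COM) for one omics data set."""
    source: str
    accession: str
    title: str
    organism: str
    assay_type: str


def from_source_a(raw: dict) -> CommonRecord:
    # Hypothetical source "A" uses flat, upper-case keys.
    return CommonRecord(
        source="A",
        accession=raw["ACC"],
        title=raw["TITLE"].strip(),
        organism=raw["ORGANISM"].strip(),
        assay_type=raw["TYPE"].strip().lower(),
    )


def from_source_b(raw: dict) -> CommonRecord:
    # Hypothetical source "B" nests sample details under "sample".
    return CommonRecord(
        source="B",
        accession=raw["id"],
        title=raw["name"].strip(),
        organism=raw["sample"]["species"].strip(),
        assay_type=raw["technology"].strip().lower(),
    )


# One mapper per source system; adding a source means adding one entry.
MAPPERS = {"A": from_source_a, "B": from_source_b}


def transform(source: str, raw_records: list[dict]) -> list[CommonRecord]:
    """The transform step of ETL: map one source's records into the COM."""
    mapper = MAPPERS[source]
    return [mapper(raw) for raw in raw_records]


def search(warehouse: list[CommonRecord], organism: str) -> list[CommonRecord]:
    """Query the warehouse on a COM field, regardless of record origin."""
    wanted = organism.lower()
    return [rec for rec in warehouse if rec.organism.lower() == wanted]


def deduplicate(records: list[CommonRecord]) -> list[CommonRecord]:
    """Keep the first record seen per (title, organism, assay_type) key.

    This key is an illustrative heuristic, not a registry lookup.
    """
    seen = set()
    unique = []
    for rec in records:
        key = (rec.title.lower(), rec.organism.lower(), rec.assay_type)
        if key not in seen:
            seen.add(key)
            unique.append(rec)
    return unique
```

Because every query touches only `CommonRecord` fields, supporting an additional source in this sketch requires a new mapper but no change to `search`. The heuristic in `deduplicate` stands in for the coordinated registry that the text notes is lacking, and it would miss duplicates whose titles differ between sources.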
Results


Prototype federated data search capabilities are currently accessible to internal (NASA) users, with open access to 
these capabilities anticipated in the GLDS no later than Sep. 2017. We will demonstrate the flexibility, performance, 
and power of our metadata representation mapping approach. Biological meta-investigations of this kind will be further
supported in Phases II and IV through the development of collaborative omics meta-analysis workspaces.